Data

We use data from the 2016 American Community Survey (ACS) for Puerto Rico to examine wage gaps between individuals with different education levels. Our research questions are: 1) How do earnings vary by education level? 2) How does the premium for education vary by gender? The ACS is a representative household survey; after the sample restrictions described below, our analysis sample contains 5,194 individuals. The survey includes questions about each household member's demographic characteristics and labor market activity.

We restrict our sample to three racial groups: White, Black, and Other. In addition, given our goal of examining earnings differences by gender and marital status, and because the ACS reports earnings on an annual basis (wages, salary, commissions, bonuses, tips, and self-employment income during the past 12 months), we restrict our sample to full-time year-round (FTYR) workers. We define FTYR workers as individuals who report positive earnings over the past year, who worked at least 40 of the past 52 weeks, and who usually worked at least 35 hours per week over this period.
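
The FTYR definition above can be sketched as a simple row filter. This is only an illustration on a made-up data frame, not our actual cleaning script. It assumes the PUMS variables `PERNP` (earnings), `WKHP` (usual weekly hours), and `WKW` (weeks worked, recorded in bands), and it assumes that `WKW` codes 1 to 3 cover 40 or more weeks; the helper name `is_ftyr` is hypothetical, and the coding should be checked against the 2016 PUMS data dictionary.

```r
# Toy stand-in for the person file; values are invented for illustration.
toy <- data.frame(
  PERNP = c(25000, 0, 18000, 40000, 12000),  # earnings over the past 12 months
  WKW   = c(1, 1, 4, 3, 2),                  # assumed bands: 1 = 50-52, 2 = 48-49, 3 = 40-47 weeks
  WKHP  = c(40, 40, 45, 35, 20)              # usual hours worked per week
)

# Hypothetical helper: TRUE for full-time year-round workers
# (positive earnings, at least 40 weeks, at least 35 usual hours).
is_ftyr <- function(d) {
  d$PERNP > 0 & d$WKW %in% 1:3 & d$WKHP >= 35
}

# which() drops rows where the condition is NA instead of returning NA rows.
ftyr <- toy[which(is_ftyr(toy)), ]
print(ftyr)
```

In this toy example only the first and fourth rows satisfy all three conditions.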

EDA Insights:

For our exploratory analysis we looked at population breakdowns by education, age, marital status, gender, race, earnings, and work hours. We applied filters on education (HS diploma or above), age (18-64), and work hours (>35/week).

An earnings histogram revealed a default maximum (top-coded) earnings value of about 189k, which we also filtered out of the data. The earnings distribution tapers gradually above the median but drops off sharply below it, likely reflecting the presence of a minimum wage. The correlation between age and earnings is weak (0.23). Likewise, earnings are only weakly correlated with hours worked among those who work more than 35 hours per week. However, white individuals appear to have an earnings premium over other races, and both married and divorced individuals appear to have an earnings premium over those who have never been married. Given that the correlation between age and earnings is weak, the marriage premium may be due to other qualitative factors possessed by those who get married. Marital status was recategorized into married, divorced, and never married. Men also appear to earn a small premium over women.
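
A minimal sketch of the top-code filter and the age-earnings correlation described above, run on simulated data rather than the ACS extract. It assumes the top-coded value is simply the largest `PERNP` observed, and the simulated numbers are not meant to reproduce the 0.23 correlation reported here.

```r
set.seed(1)

# Simulated ages and earnings; purely illustrative.
toy <- data.frame(AGEP = sample(18:64, 200, replace = TRUE))
toy$PERNP <- round(15000 + 300 * toy$AGEP + rnorm(200, sd = 12000))
toy$PERNP[1:3] <- 189000   # pretend three records carry the top-coded value

# Assumption: the top code equals the maximum observed earnings value.
top_code <- max(toy$PERNP)
kept <- toy[toy$PERNP < top_code, ]

# Pearson correlation between age and earnings after the filter.
r <- cor(kept$AGEP, kept$PERNP)
print(nrow(kept))
print(r)
```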

The age distribution of full-time workers is skewed towards older adults, possibly indicating that younger workers have trouble finding full-time work, delay entering the workforce, or are leaving the territory.

Preliminary Econometric Estimates

First Model:

\(Earning = \beta_0 + Divorced * \beta_1 + NeverMarried * \beta_2 + Female * \beta_3 + RaceBlack * \beta_4 + RaceOther * \beta_5 +\)

\(SomeCollege * \beta_6 + Associate * \beta_7 + Bachelor * \beta_8 + Master * \beta_9 + Professional * \beta_{10} + Doctoral * \beta_{11} + Age * \beta_{12}\)

## 
## Call:
## lm(formula = PERNP ~ Divorced + NeverMarried + Female + RaceBlack + 
##     RaceOther + SomeCollege + Associate + Bachelor + Master + 
##     Professional + Doctoral + AGEP, data = ss16ppr)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -43715  -9218  -2930   5352  98732 
## 
## Coefficients:
##              Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)  12561.11    1109.02  11.326 < 0.0000000000000002 ***
## Divorced     -1185.01     548.34  -2.161             0.030735 *  
## NeverMarried -2990.72     519.43  -5.758        0.00000000902 ***
## Female       -4846.95     431.25 -11.239 < 0.0000000000000002 ***
## RaceBlack    -1176.12     591.26  -1.989             0.046733 *  
## RaceOther    -2133.58     579.76  -3.680             0.000236 ***
## SomeCollege   4232.33     694.53   6.094        0.00000000118 ***
## Associate     4151.64     688.15   6.033        0.00000000172 ***
## Bachelor     12333.05     586.55  21.027 < 0.0000000000000002 ***
## Master       17780.48     812.08  21.895 < 0.0000000000000002 ***
## Professional 28122.73    1475.26  19.063 < 0.0000000000000002 ***
## Doctoral     35651.82    1615.88  22.063 < 0.0000000000000002 ***
## AGEP           286.21      21.04  13.604 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14920 on 5181 degrees of freedom
## Multiple R-squared:  0.2465, Adjusted R-squared:  0.2448 
## F-statistic: 141.3 on 12 and 5181 DF,  p-value: < 0.00000000000000022

## 
## Call:
## lm(formula = log(PERNP) ~ Divorced + NeverMarried + Female + 
##     RaceBlack + RaceOther + SomeCollege + Associate + Bachelor + 
##     Master + Professional + Doctoral + AGEP, data = ss16ppr)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.56257 -0.30721 -0.03127  0.27682  1.69744 
## 
## Coefficients:
##                Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)   9.5837348  0.0331079 289.470 < 0.0000000000000002 ***
## Divorced     -0.0401003  0.0163699  -2.450             0.014333 *  
## NeverMarried -0.1035958  0.0155069  -6.681   0.0000000000262869 ***
## Female       -0.1377478  0.0128744 -10.699 < 0.0000000000000002 ***
## RaceBlack    -0.0278243  0.0176511  -1.576             0.115005    
## RaceOther    -0.0570026  0.0173079  -3.293             0.000996 ***
## SomeCollege   0.1504187  0.0207341   7.255   0.0000000000004622 ***
## Associate     0.1546164  0.0205437   7.526   0.0000000000000612 ***
## Bachelor      0.4212902  0.0175104  24.059 < 0.0000000000000002 ***
## Master        0.5746011  0.0242433  23.701 < 0.0000000000000002 ***
## Professional  0.8184414  0.0440417  18.583 < 0.0000000000000002 ***
## Doctoral      0.9593021  0.0482395  19.886 < 0.0000000000000002 ***
## AGEP          0.0092830  0.0006281  14.780 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4453 on 5181 degrees of freedom
## Multiple R-squared:  0.2556, Adjusted R-squared:  0.2539 
## F-statistic: 148.3 on 12 and 5181 DF,  p-value: < 0.00000000000000022

  • Coefficients Explanation (log earnings model; percentages are computed as 100 * (exp(coefficient) - 1))
    • Holding gender, race, education and age constant, people who are divorced or separated earn about 3.9% less on average than people who are married or widowed.
    • Holding gender, race, education and age constant, people who never married earn about 9.8% less on average than people who are married or widowed.
    • Holding marital status, race, education and age constant, women earn about 12.9% less than men on average.
    • Holding marital status, gender, education and age constant, Black workers earn about 2.7% less than White workers on average, although this difference is not statistically significant at the 5% level (p = 0.115).
    • Holding marital status, gender, education and age constant, workers of Other race earn about 5.5% less than White workers on average.
    • Holding marital status, gender, race and age constant, people with some college education earn about 16.2% more on average than people with a high school education.
    • Holding marital status, gender, race and age constant, people with an associate degree earn about 16.7% more on average than people with a high school education.
    • Holding marital status, gender, race and age constant, people with a bachelor’s degree earn about 52.4% more on average than people with a high school education.
    • Holding marital status, gender, race and age constant, people with a master’s degree earn about 77.6% more on average than people with a high school education.
    • Holding marital status, gender, race and age constant, people with a professional degree earn about 126.7% more on average than people with a high school education.
    • Holding marital status, gender, race and age constant, people with a doctoral degree earn about 161.0% more on average than people with a high school education.
    • Holding marital status, gender, race and education constant, each additional year of age is associated with about 0.9% higher earnings on average, for workers aged 18 to 64.
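
Because the dependent variable in the second regression is log(PERNP), its coefficients are differences in log earnings, not dollar amounts. A coefficient b on a dummy variable corresponds to a percentage difference of 100 * (exp(b) - 1). The sketch below applies that conversion to a few coefficients copied from the log model output above; the helper name `pct_change` is our own and not part of any package.

```r
# Selected coefficients copied from the first log(PERNP) model output.
b <- c(Divorced = -0.0401003, NeverMarried = -0.1035958, Female = -0.1377478,
       Bachelor = 0.4212902, Doctoral = 0.9593021, AGEP = 0.0092830)

# Hypothetical helper: exact percentage change implied by a log-model coefficient.
pct_change <- function(beta) 100 * (exp(beta) - 1)

print(round(pct_change(b), 1))
```

For small coefficients the percentage is close to 100 times the coefficient itself (for example, about 0.9% per year of age), but for large ones the gap is substantial: the Bachelor coefficient of 0.421 corresponds to roughly 52%, not 42%.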

First Model Updated:

\(Earning = \beta_0 + Widowed * \beta_1 + Divorced * \beta_2 + Separated * \beta_3 + NeverMarried * \beta_4 + Female * \beta_5 + RaceBlack * \beta_6 + RaceOther * \beta_7 +\)

\(SomeCollege * \beta_8 + Associate * \beta_9 + Bachelor * \beta_{10} + Master * \beta_{11} + Professional * \beta_{12} + Doctoral * \beta_{13} + Age * \beta_{14}\)

## 
## Call:
## lm(formula = PERNP ~ Widowed + Divorced + Separated + NeverMarried + 
##     Female + RaceBlack + RaceOther + SomeCollege + Associate + 
##     Bachelor + Master + Professional + Doctoral + AGEP, data = ss16ppr)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -43683  -9228  -2919   5341  98762 
## 
## Coefficients:
##              Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)  12598.97    1110.28  11.348 < 0.0000000000000002 ***
## Widowed       1104.92    1888.02   0.585             0.558422    
## Divorced     -1078.74     571.98  -1.886             0.059353 .  
## Separated    -1833.17    1489.22  -1.231             0.218393    
## NeverMarried -2974.75     520.48  -5.715        0.00000001155 ***
## Female       -4868.47     432.66 -11.252 < 0.0000000000000002 ***
## RaceBlack    -1178.01     591.36  -1.992             0.046421 *  
## RaceOther    -2141.04     579.98  -3.692             0.000225 ***
## SomeCollege   4241.63     694.73   6.105        0.00000000110 ***
## Associate     4150.43     688.45   6.029        0.00000000177 ***
## Bachelor     12337.16     587.07  21.015 < 0.0000000000000002 ***
## Master       17795.40     812.61  21.899 < 0.0000000000000002 ***
## Professional 28133.37    1476.36  19.056 < 0.0000000000000002 ***
## Doctoral     35674.46    1617.08  22.061 < 0.0000000000000002 ***
## AGEP           284.94      21.12  13.488 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14920 on 5179 degrees of freedom
## Multiple R-squared:  0.2466, Adjusted R-squared:  0.2446 
## F-statistic: 121.1 on 14 and 5179 DF,  p-value: < 0.00000000000000022

## 
## Call:
## lm(formula = log(PERNP) ~ Widowed + Divorced + Separated + NeverMarried + 
##     Female + RaceBlack + RaceOther + SomeCollege + Associate + 
##     Bachelor + Master + Professional + Doctoral + AGEP, data = ss16ppr)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.56195 -0.30715 -0.03111  0.27690  1.69792 
## 
## Coefficients:
##                Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)   9.5841704  0.0331471 289.140 < 0.0000000000000002 ***
## Widowed       0.0167914  0.0563665   0.298             0.765794    
## Divorced     -0.0391735  0.0170763  -2.294             0.021829 *  
## Separated    -0.0437489  0.0444603  -0.984             0.325162    
## NeverMarried -0.1033310  0.0155387  -6.650   0.0000000000323503 ***
## Female       -0.1380595  0.0129169 -10.688 < 0.0000000000000002 ***
## RaceBlack    -0.0278297  0.0176550  -1.576             0.115017    
## RaceOther    -0.0570606  0.0173151  -3.295             0.000989 ***
## SomeCollege   0.1505241  0.0207411   7.257   0.0000000000004534 ***
## Associate     0.1546604  0.0205537   7.525   0.0000000000000619 ***
## Bachelor      0.4214137  0.0175270  24.044 < 0.0000000000000002 ***
## Master        0.5748312  0.0242603  23.694 < 0.0000000000000002 ***
## Professional  0.8187333  0.0440766  18.575 < 0.0000000000000002 ***
## Doctoral      0.9597258  0.0482778  19.879 < 0.0000000000000002 ***
## AGEP          0.0092656  0.0006307  14.692 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4454 on 5179 degrees of freedom
## Multiple R-squared:  0.2556, Adjusted R-squared:  0.2536 
## F-statistic:   127 on 14 and 5179 DF,  p-value: < 0.00000000000000022
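
The updated model splits marital status more finely than the first model, which grouped widowed with married and separated with divorced. One way to check whether the finer coding adds explanatory power is an F-test on the two nested models. The sketch below shows the mechanics on simulated data; it is not a result from our sample, and the variable names `logearn` and `DivSep` are invented for the illustration.

```r
set.seed(42)
n <- 500

# Simulated marital status and dummies; purely illustrative.
mar <- sample(c("Married", "Widowed", "Divorced", "Separated", "NeverMarried"),
              n, replace = TRUE)
toy <- data.frame(
  Widowed      = as.integer(mar == "Widowed"),
  Divorced     = as.integer(mar == "Divorced"),
  Separated    = as.integer(mar == "Separated"),
  NeverMarried = as.integer(mar == "NeverMarried")
)
toy$logearn <- 10 - 0.04 * toy$Divorced - 0.10 * toy$NeverMarried +
  rnorm(n, sd = 0.45)

# Coarse coding: widowed stays in the reference group, separated joins divorced.
toy$DivSep <- toy$Divorced + toy$Separated

coarse <- lm(logearn ~ DivSep + NeverMarried, data = toy)
fine   <- lm(logearn ~ Widowed + Divorced + Separated + NeverMarried, data = toy)

# F-test of the two restrictions that turn the fine model into the coarse one.
cmp <- anova(coarse, fine)
print(cmp)
```

A large p-value in the second row of the table would suggest that the coarser grouping loses little information.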

Second Model:

\(Earning = \beta_0 + Female * \beta_1 + SomeCollege * \beta_2 + Associate * \beta_3 + Bachelor * \beta_4 +\)

\(Master * \beta_5 + Professional * \beta_6 + Doctoral * \beta_7\)

## 
## Call:
## lm(formula = PERNP ~ Female + SomeCollege + Associate + Bachelor + 
##     Master + Professional + Doctoral, data = ss16ppr)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -38674  -9557  -3538   5761 101580 
## 
## Coefficients:
##              Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)   23420.3      472.7  49.544 < 0.0000000000000002 ***
## Female        -4680.9      442.5 -10.579 < 0.0000000000000002 ***
## SomeCollege    3254.4      713.8   4.559         0.0000052485 ***
## Associate      3916.5      708.4   5.529         0.0000000338 ***
## Bachelor      12148.9      603.5  20.130 < 0.0000000000000002 ***
## Master        17854.4      836.3  21.350 < 0.0000000000000002 ***
## Professional  28253.5     1521.5  18.570 < 0.0000000000000002 ***
## Doctoral      37519.8     1663.3  22.557 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15390 on 5186 degrees of freedom
## Multiple R-squared:  0.197,  Adjusted R-squared:  0.1959 
## F-statistic: 181.7 on 7 and 5186 DF,  p-value: < 0.00000000000000022

## 
## Call:
## lm(formula = log(PERNP) ~ Female + SomeCollege + Associate + 
##     Bachelor + Master + Professional + Doctoral, data = ss16ppr)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.37759 -0.32163 -0.03395  0.29310  1.79864 
## 
## Coefficients:
##              Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)   9.93743    0.01420 699.976 < 0.0000000000000002 ***
## Female       -0.13243    0.01329  -9.966 < 0.0000000000000002 ***
## SomeCollege   0.11836    0.02144   5.521     0.00000003524980 ***
## Associate     0.14704    0.02127   6.911     0.00000000000538 ***
## Bachelor      0.41489    0.01812  22.891 < 0.0000000000000002 ***
## Master        0.57651    0.02511  22.955 < 0.0000000000000002 ***
## Professional  0.82216    0.04569  17.993 < 0.0000000000000002 ***
## Doctoral      1.01987    0.04995  20.416 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4623 on 5186 degrees of freedom
## Multiple R-squared:  0.1972, Adjusted R-squared:  0.1961 
## F-statistic: 181.9 on 7 and 5186 DF,  p-value: < 0.00000000000000022

  • Coefficients Explanation (percentages are from the log earnings model, computed as 100 * (exp(coefficient) - 1))
    • Holding education constant, women earn about $4,681 less than men on average in the levels model, or about 12.4% less in the log model.
    • Holding gender constant, people with some college education earn about 12.6% more on average than people with a high school education.
    • Holding gender constant, people with an associate degree earn about 15.8% more on average than people with a high school education.
    • Holding gender constant, people with a bachelor’s degree earn about 51.4% more on average than people with a high school education.
    • Holding gender constant, people with a master’s degree earn about 78.0% more on average than people with a high school education.
    • Holding gender constant, people with a professional degree earn about 127.5% more on average than people with a high school education.
    • Holding gender constant, people with a doctoral degree earn about 177.3% more on average than people with a high school education.
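
The second model includes Female and the education dummies additively, so it estimates one gender gap that is the same at every education level. To address our second research question (how the education premium varies by gender), one possible extension is to interact Female with the education dummies. The sketch below shows the idea for a single education dummy on simulated data; it is an assumption about how the analysis could be extended, not a model we have estimated, and `logearn` is an invented variable name.

```r
set.seed(7)
n <- 1000

# Simulated data with a built-in interaction; values are illustrative only.
toy <- data.frame(
  Female   = rbinom(n, 1, 0.5),
  Bachelor = rbinom(n, 1, 0.4)
)
toy$logearn <- 9.9 - 0.13 * toy$Female + 0.40 * toy$Bachelor +
  0.05 * toy$Female * toy$Bachelor + rnorm(n, sd = 0.45)

# Female * Bachelor expands to Female + Bachelor + Female:Bachelor.
fit <- lm(logearn ~ Female * Bachelor, data = toy)
print(coef(fit))
```

The `Female:Bachelor` coefficient measures how much the bachelor's premium for women differs from the premium for men, in log points.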